ABSTRACT

When algorithms for sorting and searching are applied to keys that are represented as bit
strings, we can quantify the performance of the algorithms not only in terms of the number
of key comparisons required by the algorithms but also in terms of the number of bit
comparisons. Some of the standard sorting and searching algorithms have been analyzed with
respect to key comparisons but not with respect to bit comparisons. In this paper, we
investigate the expected number of bit comparisons required by Quickselect (also known as Find).
We develop exact and asymptotic formulae for the expected number of bit comparisons
required to find the smallest or largest key by Quickselect and show that the expectation is
asymptotically linear with respect to the number of keys. Similar results are obtained for the
average case. For finding keys of arbitrary rank, we derive an exact formula for the expected
number of bit comparisons that (using rational arithmetic) requires only finite summation
(rather than such operations as numerical integration) and use it to compute the expectation
for each target rank.
1 Introduction and Summary

When an algorithm for sorting or searching is analyzed, the algorithm is usually regarded 
either as comparing keys pairwise irrespective of the keys' internal structure or as operating 
on representations (such as bit strings) of keys. In the former case, analyses often quantify 
the performance of the algorithm in terms of the number of key comparisons required to 
accomplish the task; Quickselect (also known as Find) is an example of those algorithms 
that have been studied from this point of view. In the latter case, if keys are represented as 
bit strings, then analyses quantify the performance of the algorithm in terms of the number 
of bits compared until it completes its task. Digital search trees, for example, have been 
examined from this perspective. 

In order to fully quantify the performance of a sorting or searching algorithm and
enable comparison between key-based and digital algorithms, it is ideal to analyze the
algorithm from both points of view. However, to date, only Quicksort has been analyzed with
both approaches; see Fill and Janson [3]. Before their study, Quicksort had been
extensively examined with regard to the number of key comparisons performed by the algorithm
(e.g., Knuth [12], Régnier [17], Rösler [18], Knessl and Szpankowski [9], Fill and Janson [2],
Neininger and Rüschendorf [16]), but it had not been examined with regard to the number
of bit comparisons in sorting keys represented as bit strings. In their study, Fill and Janson
assumed that keys are independently and uniformly distributed over (0,1) and that the keys
are represented as bit strings. [They also conducted the analysis for a general absolutely
continuous distribution over (0,1).] They showed that the expected number of bit
comparisons required to sort n keys is asymptotically equivalent to $n(\ln n)(\lg n)$ as compared to the
lead-order term of the expected number of key comparisons, which is asymptotically $2n \ln n$.
We use ln and lg to denote natural and binary logarithms, respectively, and use log when the
base does not matter (for example, in remainder estimates).

In this paper, we investigate the expected number of bit comparisons required by
Quickselect. Hoare [7] introduced this search algorithm, which is treated in most textbooks
on algorithms and data structures. Quickselect selects the m-th smallest key (we call it the
rank-m key) from a set of n distinct keys. (The keys are typically assumed to be distinct,
but the algorithm still works, with a minor adjustment, even if they are not distinct.) The
algorithm finds the target key in a recursive and random fashion. First, it selects a pivot
uniformly at random from the n keys. Let k denote the rank of the pivot. If $k = m$, then the
algorithm returns the pivot. If $k > m$, then the algorithm recursively operates on the set of
keys smaller than the pivot and returns the rank-m key. Similarly, if $k < m$, then the
algorithm recursively operates on the set of keys larger than the pivot and returns the $(m - k)$-th
smallest key from the subset. Although previous studies (e.g., Knuth [10], Mahmoud et al.
[14], Grübel and U. Rösler [6], Lent and Mahmoud [13], Mahmoud and Smythe [15], Devroye
[1], Hwang and Tsai [8]) examined Quickselect with regard to key comparisons, this study
is the first to analyze the bit complexity of the algorithm.
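
The recursive description above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `quickselect`, its return convention, and the iterative (rather than recursive) formulation are choices made here, the keys are assumed distinct, and each partitioning step is charged one key comparison per non-pivot key.

```python
import random

def quickselect(keys, m, rng=random):
    """Return (rank-m key, number of key comparisons); m is 1-based, keys distinct."""
    keys = list(keys)
    comparisons = 0
    while True:
        pivot = rng.choice(keys)              # pivot chosen uniformly at random
        smaller = [x for x in keys if x < pivot]
        larger = [x for x in keys if x > pivot]
        comparisons += len(keys) - 1          # the pivot meets every other key once
        k = len(smaller) + 1                  # rank of the pivot in the current subset
        if k == m:
            return pivot, comparisons
        if k > m:
            keys = smaller                    # target keeps rank m among smaller keys
        else:
            keys = larger                     # target has rank m - k among larger keys
            m -= k
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible; the returned key does not depend on the pivot choices, only the comparison count does.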

We suppose that the algorithm is applied to n distinct keys that are represented as
bit strings and that the algorithm operates on individual bits in order to find a target key.
We also assume that the n keys are uniformly and independently distributed in (0,1). For
instance, consider applying Quickselect to find the smallest key among three keys $k_1$, $k_2$,
and $k_3$ whose binary representations are .01001100..., .00110101..., and .00101010...,
respectively. If the algorithm selects $k_3$ as a pivot, then it compares each of $k_1$ and $k_2$ to $k_3$ in order
to determine the rank of $k_3$. When $k_1$ and $k_3$ are compared, the algorithm requires 2 bit
comparisons to determine that $k_3$ is smaller than $k_1$ because the two keys have the same first
digit and differ at the second digit. Similarly, when $k_2$ and $k_3$ are compared, the algorithm
requires 4 bit comparisons to determine that $k_3$ is smaller than $k_2$. After these comparisons,
key $k_3$ has been identified as smallest. Hence the search for the smallest key requires a total
of 6 bit comparisons (resulting from the two key comparisons).
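
The bit counts in this example can be reproduced with a short Python sketch. The helper name `bit_comparisons` and the representation of keys as strings of '0' and '1' characters (truncated to the eight bits shown) are assumptions made for illustration.

```python
def bit_comparisons(a: str, b: str) -> int:
    """Number of bits examined before two distinct bit-string keys are told apart."""
    for index, (bit_a, bit_b) in enumerate(zip(a, b), start=1):
        if bit_a != bit_b:
            return index                  # first position at which the keys differ
    raise ValueError("keys agree on every available bit")

# The three keys of the example, truncated to the bits displayed in the text.
k1, k2, k3 = "01001100", "00110101", "00101010"
total = bit_comparisons(k1, k3) + bit_comparisons(k2, k3)
```

With $k_3$ as the pivot, the two key comparisons cost 2 and 4 bit comparisons, so `total` is 6.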

We let $\mu(m, n)$ denote the expected number of bit comparisons required to find the
rank-m key in a file of n keys by Quickselect. By symmetry, $\mu(m, n) = \mu(n + 1 - m, n)$.
First, we develop exact and asymptotic formulae for $\mu(1, n) = \mu(n, n)$, the expected number
of bit comparisons required to find the smallest key by Quickselect, as summarized in the
following theorem.

Theorem 1.1. The expected number $\mu(1, n)$ of bit comparisons required by Quickselect to
find the smallest key in a file of n keys that are independently and uniformly distributed in
(0, 1) has the following exact and asymptotic expressions:

\[
\mu(1,n) = 2n(H_n - 1) + 2\sum_{j=2}^{n-1} B_j\,\frac{n - j + 1 - \binom{n}{j}}{j(j-1)(1 - 2^{-j})}
= cn - \frac{1}{\ln 2}(\ln n)^2 - \left(\frac{2}{\ln 2} + 1\right)\ln n + O(1),
\]

where $H_n$ and $B_j$ denote harmonic and Bernoulli numbers, respectively, and, with
$\chi_k := 2\pi i k/\ln 2$ and $\gamma :=$ Euler's constant $\doteq 0.57722$, we define

\[
c := \frac{28}{9} + \frac{17 - 6\gamma}{9 \ln 2} - \frac{4}{\ln 2}\sum_{k \in \mathbb{Z}\setminus\{0\}} \frac{\zeta(1 - \chi_k)\,\Gamma(1 - \chi_k)}{\Gamma(4 - \chi_k)(1 - \chi_k)} \doteq 5.27938.
\]

The asymptotic formula shows that the expected number of bit comparisons is
asymptotically linear in n with the lead-order coefficient approximately equal to 5.27938. Hence
the expected number of bit comparisons is asymptotically different from that of key
comparisons required to find the smallest key only by a constant factor (the expectation for key
comparisons is asymptotically 2n). Complex-analytical methods are utilized to obtain the
asymptotic formula. Details of the derivations of the formulae are described in Section 3.

We also derive exact and asymptotic expressions for the expected number of bit
comparisons for the average case. We denote this expectation by $\mu(\bar{m}, n)$. In the
average case, the parameter m in $\mu(m, n)$ is considered a discrete uniform random variable;
hence $\mu(\bar{m}, n) = \frac{1}{n}\sum_{m=1}^{n} \mu(m, n)$. The derived asymptotic formula shows that $\mu(\bar{m}, n)$ is
also asymptotically linear in n; see (4.48). More detailed results for $\mu(\bar{m}, n)$ are described in
Section 4.

Lastly, in Section 5, we derive an exact expression of $\mu(m, n)$ for each fixed m that is
suited for computations. Our preliminary exact formula for $\mu(m, n)$ [shown in (2.8)] entails
infinite summation and integration. As a result, it is not a desirable form for numerically
computing the expected number of bit comparisons. Hence we establish another exact
formula that only requires finite summation and use it to compute $\mu(m, n)$ for $m = 1, \dots, n$,
$n = 2, \dots, 25$. The computation leads to the following conjectures: (i) for fixed n, $\mu(m, n)$
increases in m for $m \le (n+1)/2$ and is symmetric about $(n+1)/2$; and (ii) for fixed m, $\mu(m, n)$ increases
in n (asymptotically linearly).
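
As a hedged illustration of how the exact expression for the smallest key can be evaluated in rational arithmetic, the sketch below computes $2n(H_n - 1) + 2\sum_{j=2}^{n-1} B_j\,[n - j + 1 - \binom{n}{j}]/[j(j-1)(1 - 2^{-j})]$ with Python's `fractions` module. The function names are hypothetical, and the Bernoulli numbers are generated by the standard recurrence (only indices $j \ge 2$ enter the sum, so the sign convention for $B_1$ is immaterial).

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] as exact fractions (convention B_1 = -1/2)."""
    b = [Fraction(1)]
    for m in range(1, count):
        # Standard recurrence: sum_{k=0}^{m} C(m+1, k) B_k = 0.
        b.append(-sum(comb(m + 1, k) * b[k] for k in range(m)) / (m + 1))
    return b

def mu_min(n):
    """Exact expected number of bit comparisons to find the smallest of n keys."""
    b = bernoulli_numbers(max(n, 2))
    harmonic = sum(Fraction(1, i) for i in range(1, n + 1))
    t_n = sum(
        b[j] * (n - j + 1 - comb(n, j)) / (j * (j - 1) * (1 - Fraction(1, 2**j)))
        for j in range(2, n)
    )
    return 2 * n * (harmonic - 1) + 2 * t_n
```

For n = 2 the value is 2, the expected number of bits needed to separate two independent uniform keys; for n = 3 and n = 4 the sketch gives 43/9 and 8.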



2 Preliminaries 

To investigate the bit complexity of Quickselect, we follow the general approach developed
by Fill and Janson [3]. Let $U_1, \dots, U_n$ denote the n keys uniformly and independently
distributed on (0, 1), and let $U_{(i)}$ denote the rank-i key. Then, for $1 \le i < j \le n$ (assume
$n \ge 2$),

\[
P\{U_{(i)} \text{ and } U_{(j)} \text{ are compared}\} =
\begin{cases}
\dfrac{2}{j - m + 1} & \text{if } m \le i \\[2ex]
\dfrac{2}{j - i + 1} & \text{if } i < m < j \\[2ex]
\dfrac{2}{m - i + 1} & \text{if } j \le m.
\end{cases}
\tag{2.1}
\]

To determine the first probability in (2.1), note that $U_{(m)}, \dots, U_{(j)}$ remain in the same subset
until the first time that one of them is chosen as a pivot. Therefore, $U_{(i)}$ and $U_{(j)}$ are compared
if and only if the first pivot chosen from $U_{(m)}, \dots, U_{(j)}$ is either $U_{(i)}$ or $U_{(j)}$. Analogous
arguments establish the other two cases.
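
Summing the three-case probability in (2.1) over all pairs gives the expected number of key comparisons, which can be cross-checked against the pivot recursion. The sketch below is an illustration only; the function names are hypothetical and exact rational arithmetic is used throughout.

```python
from fractions import Fraction

def compare_probability(i, j, m):
    """P{U_(i) and U_(j) are compared} when Quickselect seeks rank m (i < j)."""
    if m <= i:
        return Fraction(2, j - m + 1)
    if m < j:                                  # here i < m < j
        return Fraction(2, j - i + 1)
    return Fraction(2, m - i + 1)              # here j <= m

def expected_key_comparisons(m, n):
    """Sum of the pairwise comparison probabilities over all 1 <= i < j <= n."""
    return sum(compare_probability(i, j, m)
               for i in range(1, n + 1) for j in range(i + 1, n + 1))

def expected_by_recursion(m, n):
    """Same expectation computed directly from the uniform-pivot recursion."""
    if n == 1:
        return Fraction(0)
    total = Fraction(0)
    for k in range(1, n + 1):                  # k = rank of the pivot
        if k > m:
            total += expected_by_recursion(m, k - 1)
        elif k < m:
            total += expected_by_recursion(m - k, n - k)
    return n - 1 + total / n
```

For example, with n = 3 and m = 1 both routes give 7/3.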

For $0 < s < t < 1$, it is well known that the joint density function of $U_{(i)}$ and $U_{(j)}$ is
given by

\[
f_{U_{(i)},U_{(j)}}(s,t) := \binom{n}{i-1,\,1,\,j-i-1,\,1,\,n-j}\, s^{i-1}(t-s)^{j-i-1}(1-t)^{n-j}. \tag{2.2}
\]

Clearly, the event that $U_{(i)}$ and $U_{(j)}$ are compared is independent of the random variables
$U_{(i)}$ and $U_{(j)}$. Hence, defining

\[
P_1(s,t,m,n) := \sum_{m \le i < j \le n} \frac{2}{j - m + 1}\, f_{U_{(i)},U_{(j)}}(s,t), \tag{2.3}
\]
\[
P_2(s,t,m,n) := \sum_{1 \le i < m < j \le n} \frac{2}{j - i + 1}\, f_{U_{(i)},U_{(j)}}(s,t), \tag{2.4}
\]
\[
P_3(s,t,m,n) := \sum_{1 \le i < j \le m} \frac{2}{m - i + 1}\, f_{U_{(i)},U_{(j)}}(s,t), \tag{2.5}
\]
\[
P(s,t,m,n) := P_1(s,t,m,n) + P_2(s,t,m,n) + P_3(s,t,m,n) \tag{2.6}
\]

[the sums in (2.3)-(2.5) are double sums over i and j], and letting $\beta(s,t)$ denote the index
of the first bit at which the keys s and t differ, we can write the expectation $\mu(m, n)$ of the
number of bit comparisons required to find the rank-m key in a file of n keys as

\[
\mu(m,n) = \int_0^1\!\!\int_s^1 \beta(s,t)\,P(s,t,m,n)\,dt\,ds \tag{2.7}
\]
\[
\phantom{\mu(m,n)} = \sum_{k=0}^{\infty}\sum_{l=1}^{2^k} \int_{(l-1)2^{-k}}^{(l-\frac12)2^{-k}}\int_{(l-\frac12)2^{-k}}^{l\,2^{-k}} (k+1)\,P(s,t,m,n)\,dt\,ds; \tag{2.8}
\]

in this expression, note that k represents the last bit at which s and t agree.
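
To make the dyadic decomposition concrete, the following sketch implements $\beta(s,t)$ for exact rationals and evaluates partial sums of the double series in the simplest case n = 2, where the integrand $P(s,t,m,n)$ is identically 2 (two keys are always compared, and their joint density is 2). The function names and the use of `Fraction` inputs are assumptions made for illustration.

```python
from fractions import Fraction

def beta(s, t):
    """Index of the first bit at which s and t in [0, 1) differ (requires s != t)."""
    index = 0
    while True:
        index += 1
        s, t = 2 * s, 2 * t
        bit_s, bit_t = int(s), int(t)      # next binary digit of each key
        if bit_s != bit_t:
            return index
        s, t = s - bit_s, t - bit_t

def partial_sum(K):
    """Terms k = 0..K of the dyadic series when P(s, t, m, n) = 2 (case n = 2)."""
    total = Fraction(0)
    for k in range(K + 1):
        cells = 2**k                           # number of values of l at level k
        area = Fraction(1, 2**(k + 1)) ** 2    # each cell is a square of side 2^-(k+1)
        total += (k + 1) * cells * 2 * area
    return total
```

The partial sums equal $2 - (K+3)/2^{K+1}$ and therefore converge to 2, the expected number of bit comparisons needed to separate two independent uniform keys.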



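3 Analysis of $\mu(1,n)$

In Section 3.1 we derive the exact expression for $\mu(1,n)$ shown in Theorem 1.1. In Section
3.2 we prove the asymptotic result stated in Theorem 1.1.

3.1 Exact Computation of $\mu(1,n)$

Since the contribution of $P_2(s,t,m,n)$ or $P_3(s,t,m,n)$ to $P(s,t,m,n)$ is zero for m = 1, we
have $P(s,t,1,n) = P_1(s,t,1,n)$ [see (2.3) through (2.6)]. Let $x := s$, $y := t - s$, $z := 1 - t$.
Then

\[
\begin{aligned}
P_1(s,t,1,n) &= \sum_{1 \le i < j \le n} \frac{2}{j}\binom{n}{i-1,\,1,\,j-i-1,\,1,\,n-j} x^{i-1} y^{j-i-1} z^{n-j} \\
&= 2\sum_{1 \le i < j \le n} z^{j}\int_z^{\infty} \eta^{-j-1}\,d\eta \;\, n(n-1)\binom{n-2}{i-1,\,j-i-1,\,n-j} x^{i-1} y^{j-i-1} z^{n-j} \\
&= 2z^{n}\int_z^{\infty} \eta^{-n-1}\, n(n-1)(x + y + \eta)^{n-2}\,d\eta \\
&= 2z^{n} n(n-1)\int_z^{\infty} \eta^{-3}\left(\frac{t}{\eta} + 1\right)^{n-2} d\eta.
\end{aligned}
\tag{3.1}
\]

Making the change of variables $v = \frac{t}{\eta} + 1$ and integrating, and recalling $z = 1 - t$, we find,
after some calculation,

\[
P_1(s,t,1,n) = 2\sum_{j=2}^{n} (-1)^j \binom{n}{j} t^{j-2}. \tag{3.2}
\]

From (2.8) and (3.2),

\[
\begin{aligned}
\mu(1,n) &= \sum_{k=0}^{\infty}(k+1)\sum_{l=1}^{2^k} \int_{(l-1)2^{-k}}^{(l-\frac12)2^{-k}}\int_{(l-\frac12)2^{-k}}^{l\,2^{-k}} P_1(s,t,1,n)\,dt\,ds \\
&= 2\sum_{k=0}^{\infty}(k+1)\sum_{l=1}^{2^k}\sum_{j=2}^{n} (-1)^j\binom{n}{j}\, 2^{-(k+1)} \int_{(l-\frac12)2^{-k}}^{l\,2^{-k}} t^{j-2}\,dt \\
&= \sum_{k=0}^{\infty}(k+1)\sum_{l=1}^{2^k}\sum_{j=2}^{n} \frac{(-1)^j\binom{n}{j}}{j-1}\, 2^{-k}\left\{(l\,2^{-k})^{j-1} - \left[(l - \tfrac12)2^{-k}\right]^{j-1}\right\} \\
&= \sum_{j=2}^{n} \frac{(-1)^j\binom{n}{j}}{j-1}\sum_{k=0}^{\infty}(k+1)2^{-kj}\sum_{l=1}^{2^k}\left[l^{j-1} - (l - \tfrac12)^{j-1}\right].
\end{aligned}
\tag{3.3}
\]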
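To further transform (3.3), define

\[
a_{j,r} :=
\begin{cases}
\dfrac{B_r}{r}\dbinom{j-1}{r-1} & \text{if } r \ge 2 \\[2ex]
\dfrac12 & \text{if } r = 1 \\[1.5ex]
\dfrac1j & \text{if } r = 0,
\end{cases}
\tag{3.4}
\]

where $B_r$ denotes the r-th Bernoulli number. Let $S_{n,j} := \sum_{l=1}^{n} l^{j-1}$. Then $S_{n,j} = \sum_{r=0}^{j-1} a_{j,r}\, n^{j-r}$
(see Knuth [12]), and

\[
\begin{aligned}
\sum_{l=1}^{2^k}\left[l^{j-1} - (l - \tfrac12)^{j-1}\right]
&= S_{2^k,j} - 2^{-(j-1)}\sum_{l=1}^{2^k}(2l - 1)^{j-1} \\
&= S_{2^k,j} - 2^{-(j-1)}\left(S_{2^{k+1},j} - 2^{j-1} S_{2^k,j}\right) = 2S_{2^k,j} - 2^{-(j-1)} S_{2^{k+1},j} \\
&= 2\sum_{r=0}^{j-1} a_{j,r}\,2^{k(j-r)} - 2^{-(j-1)}\sum_{r=0}^{j-1} a_{j,r}\,2^{(k+1)(j-r)} = 2\sum_{r=1}^{j-1} a_{j,r}\,2^{k(j-r)}(1 - 2^{-r}).
\end{aligned}
\tag{3.5}
\]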
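
The power-sum expansion and the resulting identity for the dyadic sum can be checked exactly for small parameters. In the sketch below the coefficient is taken to be $a_{j,r} = (B_r/r)\binom{j-1}{r-1}$ for $r \ge 2$, $1/2$ for $r = 1$, and $1/j$ for $r = 0$; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def bernoulli(r):
    """B_r from the standard recurrence (B_1 = -1/2); only r >= 2 is used below."""
    b = [Fraction(1)]
    for m in range(1, r + 1):
        b.append(-sum(comb(m + 1, k) * b[k] for k in range(m)) / (m + 1))
    return b[r]

def a(j, r):
    if r == 0:
        return Fraction(1, j)
    if r == 1:
        return Fraction(1, 2)
    return bernoulli(r) / r * comb(j - 1, r - 1)

def power_sum_closed_form(n, j):
    """sum_{r=0}^{j-1} a_{j,r} n^(j-r), claimed to equal sum_{l=1}^{n} l^(j-1)."""
    return sum(a(j, r) * n ** (j - r) for r in range(j))

def dyadic_sum(k, j):
    """Left side: sum over l = 1..2^k of l^(j-1) - (l - 1/2)^(j-1)."""
    return sum(l ** (j - 1) - (l - Fraction(1, 2)) ** (j - 1)
               for l in range(1, 2**k + 1))

def dyadic_closed_form(k, j):
    """Right side: 2 * sum_{r=1}^{j-1} a_{j,r} 2^(k(j-r)) (1 - 2^-r)."""
    return 2 * sum(a(j, r) * 2 ** (k * (j - r)) * (1 - Fraction(1, 2**r))
                   for r in range(1, j))
```

Note that the r = 0 term drops out of the final expression because $1 - 2^{-0} = 0$.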



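From (3.3) and (3.5),

\[
\mu(1,n) = 2\sum_{j=2}^{n} \frac{(-1)^j\binom{n}{j}}{j-1}\sum_{k=0}^{\infty}(k+1)2^{-kj}\sum_{r=1}^{j-1} a_{j,r}\,2^{k(j-r)}(1 - 2^{-r}).
\]

Here

\[
\begin{aligned}
\sum_{k=0}^{\infty}(k+1)2^{-kj}\sum_{r=1}^{j-1} a_{j,r}\,2^{k(j-r)}(1 - 2^{-r})
&= \sum_{k=0}^{\infty}(k+1)\sum_{r=1}^{j-1} a_{j,r}\,2^{-kr}(1 - 2^{-r}) \\
&= \sum_{r=1}^{j-1} a_{j,r}(1 - 2^{-r})\sum_{k=0}^{\infty}(k+1)2^{-kr} = \sum_{r=1}^{j-1} a_{j,r}(1 - 2^{-r})^{-1}.
\end{aligned}
\]

Hence

\[
\begin{aligned}
\mu(1,n) &= 2\sum_{j=2}^{n} \frac{(-1)^j\binom{n}{j}}{j-1}\sum_{r=1}^{j-1} a_{j,r}(1 - 2^{-r})^{-1}
= 2\sum_{r=1}^{n-1}(1 - 2^{-r})^{-1}\sum_{j=r+1}^{n} \frac{(-1)^j\binom{n}{j}}{j-1}\,a_{j,r} \\
&= 2\sum_{j=2}^{n} \frac{(-1)^j\binom{n}{j}}{j-1} + 2\sum_{r=2}^{n-1}(1 - 2^{-r})^{-1}\frac{B_r}{r}\sum_{j=r+1}^{n} \frac{(-1)^j\binom{n}{j}\binom{j-1}{r-1}}{j-1} \\
&= 2\sum_{j=2}^{n} \frac{(-1)^j\binom{n}{j}}{j-1} + 2\sum_{r=2}^{n-1}(1 - 2^{-r})^{-1}\frac{B_r}{r}\left[\sum_{j=r}^{n} \frac{(-1)^j\binom{n}{j}\binom{j-1}{r-1}}{j-1} - \frac{(-1)^r\binom{n}{r}}{r-1}\right].
\end{aligned}
\tag{3.6}
\]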
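To simplify $\sum_{j=r}^{n} \frac{(-1)^j\binom{n}{j}\binom{j-1}{r-1}}{j-1}$, note that

\[
\binom{n}{j}\binom{j-1}{r-1} = \frac{n!}{(n-r)!\,r!}\cdot\frac{r\,(n-r)!}{j\,(n-j)!\,(j-r)!} = \binom{n}{r}\binom{n-r}{j-r}\frac{r}{j}.
\]

Thus

\[
\begin{aligned}
\sum_{j=r}^{n} \frac{(-1)^j\binom{n}{j}\binom{j-1}{r-1}}{j-1}
&= (-1)^r\binom{n}{r}\, r\sum_{j=0}^{n-r} \frac{(-1)^j\binom{n-r}{j}}{(j+r)(j+r-1)} \\
&= (-1)^r\binom{n}{r}\, r\int_0^1\!\!\int_0^z \sum_{j=0}^{n-r}\binom{n-r}{j}(-1)^j\zeta^{\,j+r-2}\,d\zeta\,dz \\
&= (-1)^r\binom{n}{r}\, r\int_0^1\!\!\int_0^z \zeta^{\,r-2}(1 - \zeta)^{n-r}\,d\zeta\,dz \\
&= (-1)^r\binom{n}{r}\, r\int_0^1 u^{r-2}(1 - u)^{n-r+1}\,du \\
&= (-1)^r\binom{n}{r}\, r\,\frac{\Gamma(r-1)\,\Gamma(n-r+2)}{\Gamma(n+1)} = \frac{(-1)^r(n - r + 1)}{r - 1}.
\end{aligned}
\tag{3.7}
\]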
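
The closed form for this alternating sum, $\sum_{j=r}^{n} (-1)^j\binom{n}{j}\binom{j-1}{r-1}/(j-1) = (-1)^r(n-r+1)/(r-1)$, is easy to confirm exactly for small n and r. The sketch below is an illustrative check, with hypothetical function names.

```python
from fractions import Fraction
from math import comb

def alternating_sum(n, r):
    """Left side: sum over j = r..n of (-1)^j C(n, j) C(j-1, r-1) / (j-1), r >= 2."""
    return sum(Fraction((-1) ** j * comb(n, j) * comb(j - 1, r - 1), j - 1)
               for j in range(r, n + 1))

def closed_form(n, r):
    """Right side: (-1)^r (n - r + 1) / (r - 1)."""
    return Fraction((-1) ** r * (n - r + 1), r - 1)
```

For instance, n = 3 and r = 2 give 3 - 1 = 2 on the left and (3 - 2 + 1)/1 = 2 on the right.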

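Plugging (3.7) into (3.6) and recalling $B_{2k+1} = 0$ for $k \ge 1$, we finally obtain

\[
\begin{aligned}
\mu(1,n) &= 2\sum_{j=2}^{n} \frac{(-1)^j\binom{n}{j}}{j-1} + 2\sum_{r=2}^{n-1}(1 - 2^{-r})^{-1}\frac{B_r}{r}\left[\frac{(-1)^r(n - r + 1)}{r - 1} - \frac{(-1)^r\binom{n}{r}}{r - 1}\right] \\
&= 2\sum_{j=2}^{n} \frac{(-1)^j\binom{n}{j}}{j-1} + 2\sum_{j=2}^{n-1} \frac{B_j}{j(j-1)(1 - 2^{-j})}\left[n - j + 1 - \binom{n}{j}\right] \\
&= 2n(H_n - 1) + 2t_n,
\end{aligned}
\tag{3.8}
\]

where $H_n$ denotes the n-th harmonic number and

\[
t_n := \sum_{j=2}^{n-1} \frac{B_j}{j(1 - 2^{-j})}\left[\frac{n - \binom{n}{j}}{j-1} - 1\right]. \tag{3.9}
\]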



The last equality in (3.8) follows from the easy identity

k =^-- 

k=l 
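3.2 Asymptotic Analysis of $\mu(1,n)$

In order to obtain an asymptotic expression for $\mu(1,n)$, we analyze $t_n$ in (3.8)-(3.9). The
following lemma provides an exact expression for $t_n$ that easily leads to an asymptotic expression
for $\mu(1,n)$:

Lemma 3.1. For $n \ge 2$, let $u_n := t_{n+1} - t_n$ (with $t_2 = 0$) and $v_n := u_{n+1} - u_n$. Let $\gamma$ denote
Euler's constant ($\doteq 0.57722$), and define $\chi_k := 2\pi i k/\ln 2$. Then

(i)
\[
v_n = -\frac{1}{n+1} + \left[\frac{H_{n+2}}{\ln 2} - \left(\frac{\gamma}{\ln 2} - \frac12\right)\right]\frac{1}{(n+1)(n+2)} - \Sigma_n,
\]
where
\[
\Sigma_n := \sum_{k \in \mathbb{Z}\setminus\{0\}} \frac{\zeta(1 - \chi_k)\,\Gamma(n+1)\,\Gamma(1 - \chi_k)}{(\ln 2)\,\Gamma(n + 3 - \chi_k)};
\]

(ii)
\[
u_n = -H_n + a - \frac{H_{n+1}}{(\ln 2)(n+1)} + \left(\frac{\gamma - 1}{\ln 2} - \frac12\right)\frac{1}{n+1} + \hat{\Sigma}_n,
\]
where
\[
a := \frac{14}{9} + \frac{17 - 6\gamma}{18 \ln 2} - \frac{2}{\ln 2}\sum_{k \in \mathbb{Z}\setminus\{0\}} \frac{\zeta(1 - \chi_k)\,\Gamma(1 - \chi_k)}{\Gamma(4 - \chi_k)(1 - \chi_k)},
\qquad
\hat{\Sigma}_n := \sum_{k \in \mathbb{Z}\setminus\{0\}} \frac{\zeta(1 - \chi_k)\,\Gamma(1 - \chi_k)}{(\ln 2)(1 - \chi_k)}\,\frac{\Gamma(n+1)}{\Gamma(n + 2 - \chi_k)};
\]

(iii)
\[
t_n = -(nH_n - n - 1) + a(n - 2) - \frac{1}{2\ln 2}\left[H_n^2 + H_n^{(2)} - \frac72\right] + \left(\frac{\gamma - 1}{\ln 2} - \frac12\right)\left(H_n - \frac32\right) + b - \tilde{\Sigma}_n,
\]
where
\[
b := \sum_{k \in \mathbb{Z}\setminus\{0\}} \frac{2\,\zeta(1 - \chi_k)\,\Gamma(-\chi_k)}{(\ln 2)(1 - \chi_k)\,\Gamma(3 - \chi_k)},
\qquad
\tilde{\Sigma}_n := \sum_{k \in \mathbb{Z}\setminus\{0\}} \frac{\zeta(1 - \chi_k)\,\Gamma(-\chi_k)\,\Gamma(n+1)}{(\ln 2)(1 - \chi_k)\,\Gamma(n + 1 - \chi_k)},
\]
and $H_n^{(2)}$ denotes the n-th Harmonic number of order 2, i.e., $H_n^{(2)} := \sum_{i=1}^{n} \frac{1}{i^2}$.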
In this lemma, $u_n$ and $v_n$ are derived in order to obtain the exact expression for $t_n$ in (iii).
From (3.8), the exact expression for $t_n$ also provides an alternative exact expression for $\mu(1,n)$.

Before proving Lemma 3.1, we complete the proof of Theorem 1.1 using part (iii).
We know

\[
H_n = \ln n + \gamma + \frac{1}{2n} - \frac{1}{12n^2} + O(n^{-4}), \tag{3.10}
\]
\[
H_n^{(2)} = \frac{\pi^2}{6} - \frac{1}{n} + \frac{1}{2n^2} + O(n^{-3}). \tag{3.11}
\]

Combining (3.10)-(3.11) with (3.8) and Lemma 3.1(iii), we obtain an asymptotic expression
for $\mu(1,n)$:

\[
\mu(1,n) = 2an - \frac{1}{\ln 2}(\ln n)^2 - \left(\frac{2}{\ln 2} + 1\right)\ln n + O(1). \tag{3.12}
\]

The term O(1) in (3.12) has fluctuations of small magnitude due to $\tilde{\Sigma}_n$ [the n-dependent sum in Lemma 3.1(iii)], which is periodic in
log n with amplitude smaller than 0.00110. The asymptotic slope in (3.12) is

\[
c = 2a = \frac{28}{9} + \frac{17 - 6\gamma}{9 \ln 2} - \frac{4}{\ln 2}\sum_{k \in \mathbb{Z}\setminus\{0\}} \frac{\zeta(1 - \chi_k)\,\Gamma(1 - \chi_k)}{\Gamma(4 - \chi_k)(1 - \chi_k)} \doteq 5.27938. \tag{3.13}
\]
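
The asymptotic slope can be sanity-checked numerically against the exact finite-sum expression $\mu(1,n) = 2n(H_n - 1) + 2t_n$. The sketch below evaluates that expression in rational arithmetic, adds back the two logarithmic terms of (3.12), and divides by n; since the remaining term is O(1), the quotient should approach the slope at rate 1/n. The function names are hypothetical, and the comparison is an illustration rather than a proof.

```python
from fractions import Fraction
from math import comb, log

def mu_min(n):
    """Exact mu(1, n) = 2n(H_n - 1) + 2 t_n, evaluated in rational arithmetic."""
    bern = [Fraction(1)]
    for m in range(1, n):
        # Standard Bernoulli recurrence: sum_{k=0}^{m} C(m+1, k) B_k = 0.
        bern.append(-sum(comb(m + 1, k) * bern[k] for k in range(m)) / (m + 1))
    harmonic = sum(Fraction(1, i) for i in range(1, n + 1))
    t_n = sum(bern[j] * (n - j + 1 - comb(n, j))
              / (j * (j - 1) * (1 - Fraction(1, 2**j))) for j in range(2, n))
    return 2 * n * (harmonic - 1) + 2 * t_n

def slope_estimate(n):
    """(mu(1, n) plus the two logarithmic terms of the asymptotic formula) / n."""
    ln2 = log(2)
    corrected = float(mu_min(n)) + log(n) ** 2 / ln2 + (2 / ln2 + 1) * log(n)
    return corrected / n
```

For n in the low hundreds the estimate is already within a few hundredths of 5.27938.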



Now we prove Lemma 3.1.

Proof. (i) Since

\[
u_n = t_{n+1} - t_n
= \sum_{j=2}^{n} \frac{B_j}{j(1 - 2^{-j})}\left[\frac{(n+1) - \binom{n+1}{j}}{j-1} - 1\right] - \sum_{j=2}^{n-1} \frac{B_j}{j(1 - 2^{-j})}\left[\frac{n - \binom{n}{j}}{j-1} - 1\right]
= -\sum_{j=2}^{n} \frac{B_j}{j(j-1)(1 - 2^{-j})}\left[\binom{n}{j-1} - 1\right],
\]

it follows that

\[
\begin{aligned}
v_n = u_{n+1} - u_n
&= -\sum_{j=2}^{n+1} \frac{B_j}{j(j-1)(1 - 2^{-j})}\binom{n}{j-2}
= -\sum_{k=0}^{n-1}\binom{n}{k}\frac{B_{k+2}}{(k+2)(k+1)[1 - 2^{-(k+2)}]} \\
&= \sum_{k=0}^{n-1}(-1)^k\binom{n}{k}\frac{\zeta(-1-k)}{(k+1)[1 - 2^{-(k+2)}]}
\end{aligned}
\tag{3.14}
\]
\[
\phantom{v_n} = \frac{(-1)^n}{2\pi i}\int_{\mathcal{C}} \frac{\zeta(-1-s)}{(s+1)[1 - 2^{-(s+2)}]}\,\frac{n!}{s(s-1)\cdots(s-n)}\,ds, \tag{3.15}
\]

where $\mathcal{C}$ is a positively oriented closed curve that encircles the integers $0, \dots, n-1$ and does
not include or encircle any of the following points: $-2 + \chi_k$ (where $\chi_k := \frac{2\pi i k}{\ln 2}$), $k \in \mathbb{Z}$; $-1$;
and n. Equality (3.14) follows from the fact that the Bernoulli numbers are extrapolated by
the Riemann zeta function taken at nonpositive integers: $B_k = -k\,\zeta(1 - k)$. [The coefficients
$(-1)^k$ do not concern us since the Bernoulli numbers of odd index greater than 1 vanish.]
Equality (3.15) follows from a direct application of residue calculus, taking into account
contributions of the simple poles at the integers $0, \dots, n-1$.
Let $\phi(s)$ denote the integrand in (3.15):

\[
\phi(s) := \frac{\zeta(-1-s)}{(s+1)[1 - 2^{-(s+2)}]}\,\frac{n!}{s(s-1)\cdots(s-n)}.
\]

We consider a positively oriented rectangular contour Ci with horizontal sides Im(s) = A/ and 
Im(s) = —A;, where A; := ^^i^ , ^ G and vertical sides Re(s) = n — 9 and Re(s) = —A;, 
where < 9 < 1. By elementary bounds on (j){s) along Ci and the fact that 

n—9+ioo 

(j){s)ds = (3.16) 

n—9—ioo 

(this is implicit on page 113 of Flajolet and Sedgewick ^ and explicitly proved in the Ap- 
pendix), one can show that 



lim / (/)(s) ds = 0. 



i- 

Accounting for residues due to the poles encircled by Ci, we obtain 



Vn = (-1) 



n+1 



Ress=_i[(?:)(s)] + Ress=_2[(/'(s)] + ^ ReSs=-2+Xk[4'{s)] 

kez\{o} 



-I Hn + 2 _ / J 1 

, ln2 Vln2 2 



n + 1 (n+l)(n + 2) 

where 



+ , J - Sn, (3.17) 



^ (ln2)r(n + 3-Xfc) ' ^ ^ 



□ 



(ii) We have $u_2 = t_3 - t_2 = t_3 = -\frac19$. Hence, from (i),



n— 1 ^ ra— 1 

i=2 i=2 

^ n— 1 ^ ^ rt— 1 / -1 \ 1—1 , n— 1 

9 + 1 ln2^(j + l)(i + 2) \\n2 2 J {j + + 2) j^^ ■ 

~l ~ " ""^^ + h^ % {J + 2) - (i^ - 1) 

14 / ^ 1 \ 1 1 ^ n—l 

"9""3h^"^"+Vl^~2j ^ + ii^ 5 (i + + 2) " 5 ^^'^ ^^-^^^ 



i=2 "-^ ' i=2 
Here



n-l 

E 

J=2 



i+2 



(j + l)(j + 2) 



rr ri+1 „ 

i=3 i=4 



J =4 

17 Hn + l 

18 n + 1 ' 



n+l 



n + 1 



(3.20) 



(3.21) 



where we assume n > 3 for (I3:20]l . but (f3:2T|l holds also for n = 2. In regard to X;?=2 ^ 



J' 



note that 



E 

kez\{o} 



C(i-xfc)r(i-xfc) 

(ln2)(l-xfe) 



r(n + 2) 



r(n + i) 



r(n + 3-Xfc) r(n + 2-Xfc) 



so that 



E^. = - E 



J=2 



fcez\{o} 



C(i-xfc)r(i-xfc) 

(ln2)(l-xfc) 



r(3) 



r(n + i) 

r(n + 2-Xfc) " r(4-Xfc)J 



Define 



C{i-xk)ni-xk) r(n + i) 
.."km r(n + 2-x.)' 



(3.22) 



(3.23) 



Then, combining ([339|) . ([3:2T|) . and (l3:22]l . we obtain 



-i/„ + a 



n+1 



+ 



7-1 1\ 1 



(ln2)(n + 1) V ln2 2/ n + 1 



+ E, 



where 



._ 14 17-67 2 ^ c(i-xfc)r(i-xfc) 

9 181n2 ln2 ^ r(4 - Xfc)(l - Xfc) ' 

fcGZ\{0} ^ ^ ' 



(iii) Closely following the derivation of u„ described above, we obtain (for n > 2) 

n— 1 n— 1 

in = t2 + ^Uj = Uj 
j=2 3=2 
n—\ ^ " U 

= -Effi+<.("-2)-n^Ef + 

i=2 j=3 



(3.24) 
□ 



7-1 1 
In 2 ~ 2 



n-l 

2y+E% 

i=2 



-{nHn — n — 1) + a{n — 2) 



1 



2 In 2 



+ 



7-1 1 

In 2 ~ 2 



Hr, 



(3.25) 
where

h ■= 

(In2)(l-Xfc)r(3-Xfc) 

C(i-xfc)r(-xfc)r(n4 

(In2)(l-Xfc)r(n + 1-Xfc 



2C(i - Xfc)r(-Xfc) „„^ 

fcez\{o} 

^ C(i-xfc)r(-xfc)r(n + i) 

fcGZ\{0} 

□ 



4 Analysis of the Average Case: $\mu(\bar{m}, n)$
4.1 Exact Computation of $\mu(\bar{m}, n)$

Here we consider the parameter m in $\mu(m, n)$ as a discrete random variable with probability
mass function $P\{m = i\} = \frac1n$, $i = 1, 2, \dots, n$, and average over m while the parameter n is
fixed. Thus, using the notation defined in (2.3) through (2.7),

\[
\begin{aligned}
\mu(\bar{m},n) &= \frac1n\sum_{m=1}^{n}\mu(m,n) = \frac1n\sum_{m=1}^{n}\int_0^1\!\!\int_s^1 \beta(s,t)\,P(s,t,m,n)\,dt\,ds \\
&= \int_0^1\!\!\int_s^1 \beta(s,t)\,\frac1n\sum_{m=1}^{n} P(s,t,m,n)\,dt\,ds = \mu_1(\bar{m},n) + \mu_2(\bar{m},n) + \mu_3(\bar{m},n),
\end{aligned}
\]

where, for $l = 1, 2, 3$,

\[
\mu_l(\bar{m},n) := \int_0^1\!\!\int_s^1 \beta(s,t)\,\frac1n\sum_{m=1}^{n} P_l(s,t,m,n)\,dt\,ds. \tag{4.1}
\]

Here $\mu_1(\bar{m},n) = \mu_3(\bar{m},n)$, since

\[
P_3(1 - t', 1 - s', n - m' + 1, n) = P_1(s', t', m', n)
\]

by an easy symmetry argument we omit, and so

\[
\begin{aligned}
\mu_3(\bar{m},n) &= \int_0^1\!\!\int_s^1 \beta(s,t)\,\frac1n\sum_{m=1}^{n} P_3(s,t,m,n)\,dt\,ds \\
&= \int_0^1\!\!\int_{s'}^1 \beta(1 - t', 1 - s')\,\frac1n\sum_{m'=1}^{n} P_3(1 - t', 1 - s', n - m' + 1, n)\,dt'\,ds' \\
&= \int_0^1\!\!\int_{s'}^1 \beta(s', t')\,\frac1n\sum_{m'=1}^{n} P_1(s', t', m', n)\,dt'\,ds' = \mu_1(\bar{m},n).
\end{aligned}
\]

Therefore

\[
\mu(\bar{m},n) = 2\mu_1(\bar{m},n) + \mu_2(\bar{m},n), \tag{4.2}
\]

and we will compute $\mu_1(\bar{m},n)$ and $\mu_2(\bar{m},n)$ exactly in Sections 4.1.1-4.1.2.
4.1.1 Exact Computation of $\mu_1(\bar{m}, n)$

We use the following lemma in order to compute $\mu_1(\bar{m}, n)$ exactly:
Lemma 4.1.



/ / /?(s, t) — Pi{s^ t, m, n) dt ds 

J J S ^ o 



m=2 

jO'-I) 9^t^ j-1 ^j-(j-_i)(j-_2)(l-2-^) 



-2E ' 



,..2(i + i)j(j-i)(i-2-')' 

Before proving the lemma, we complete the computation of $\mu_1(\bar{m}, n)$. Note that



I I 1 v— -\ 

lii{rh,n) — j j t) — Pi{s, t, m, n) dt ds 

= - f ( P(s,t)Pi(s,t,l,n)dtds+ [ [ P(s,t)-y^ Pi(s,t,m,n)dtds 
nJo Js Jo Js n 

1 1 v-^ 

= — /u(l,n)+ / / p(s,t)—y Pi{s,t,m,n) dt ds. 
n Jo Js ^ 



m=2 



Therefore, by (3.8) and Lemma 4.1, we obtain

, 2^ (-iy(-) , 2!^^ + 

nT{m,n) = — > ^ > B^— 

^ j-1 ^j(j-_i)(i_2-J) 



^.=2 -^'^J'-I) ' ^'"^ ^j(i-l)(i-2)(l-2-i) 



^.^2 (i + i)i(j-i)(i-2-^) 



^ j(j-l)(j-2) ^j(j-_i)(i_2-^) 



9,t^ J-1 ^j0--l)(j-2)(l-2-i) 

(j + i)i(j-i)(i-2-J)' 



_2y- — ; ^ ^ — -, (4.3) 



J=2 
where the second equality holds since



i=2 i=2 



" (-iy(n-l)! " (-iy(n-l) 



^.^2 ^-(^ - - 1) (j - i)Kn - J)!(i - i)(i - 2) 



,t^(j-l)!(n-j)!0--l) 



1 1 



i J - 2. 



n 



.^3i(j-i)(j-2)- 



In Section 4.1.2 we combine the expression for $\mu_1(\bar{m}, n)$ in (4.3) with a similar
expression for $\mu_2(\bar{m}, n)$ to obtain an exact expression for $\mu(\bar{m}, n)$. The remainder of this section is
devoted to proving Lemma 4.1. For this, the following expression for $P_1(s, t, m, n)$ will prove
useful:

Lemma 4.2. Let $m \ge 2$ and let $x := s$, $y := t - s$, $z := 1 - t$. Then the quantity $P_1(s,t,m,n)$
defined at (2.3) satisfies

\[
P_1(s,t,m,n) = 2n\int_0^x \frac{1}{(\xi + y)^2}\left[T_1(m,n,\xi,x,y,z) - T_2(m,n,\xi,x,y,z) + T_3(m,n,\xi,x,y,z)\right]d\xi, \tag{4.4}
\]

where

\[
\begin{aligned}
T_1(m,n,\xi,x,y,z) &:= \binom{n-1}{m-2}(x - \xi)^{m-2}(n - m)(\xi + y + z)^{n-m+1}, \\
T_2(m,n,\xi,x,y,z) &:= \binom{n-1}{m-2}(x - \xi)^{m-2}(n - m + 1)\,z\,(\xi + y + z)^{n-m}, \\
T_3(m,n,\xi,x,y,z) &:= \binom{n-1}{m-2}(x - \xi)^{m-2} z^{n-m+1}.
\end{aligned}
\]

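Proof of Lemma 4.2. By (2.2)-(2.3),

\[
\begin{aligned}
P_1(s,t,m,n) &= \sum_{m \le i < j \le n} \frac{2}{j - m + 1}\binom{n}{i-1,\,1,\,j-i-1,\,1,\,n-j} x^{i-1} y^{j-i-1} z^{n-j} \\
&= \frac{2\,n!}{(n - m - 1)!}\sum_{m \le i < j \le n} \frac{1}{j - m + 1}\binom{n - m - 1}{i - m,\,j - i - 1,\,n - j}\frac{(i - m)!}{(i - 1)!}\,x^{i-1} y^{j-i-1} z^{n-j}.
\end{aligned}
\tag{4.5}
\]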
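In order to compactly describe the derivation of (4.4), we define the following indefinite
integration operator T:

\[
T(f(x)) := \int_0^x f(\xi)\,d\xi.
\]

We really should write $(Tf)(x)$ rather than $T(f(x))$, but we would like to use shorthand such
as $T(x^j) = \frac{x^{j+1}}{j+1}$ when $j > -1$. The operator T treats its argument f as a function of x; the
other variables involved in f (namely, y and z) are treated as constants. The notation $T^l$
will denote the l-th iterate of T. In this notation, for $m \le i$,

\[
\frac{(i - m)!}{(i - 1)!}\,x^{i-1} = T^{m-1}(x^{i-m}),
\]

and the sum in (4.5) equals

\[
T^{m-1}\left(\sum_{m \le i < j \le n} \frac{1}{j - m + 1}\binom{n - m - 1}{i - m,\,j - i - 1,\,n - j} x^{i-m} y^{j-i-1} z^{n-j}\right).
\]

Here

\[
\frac{1}{j - m + 1} = z^{\,j-m+1}\int_z^{\infty} \eta^{-(j-m+1)-1}\,d\eta,
\]

so

\[
\begin{aligned}
T^{m-1}\left(\sum_{m \le i < j \le n} \frac{1}{j - m + 1}\binom{n - m - 1}{i - m,\,j - i - 1,\,n - j} x^{i-m} y^{j-i-1} z^{n-j}\right)
&= T^{m-1}\left(z^{n-m+1}\int_z^{\infty} \eta^{-(n-m+2)}(x + y + \eta)^{n-m-1}\,d\eta\right) \\
&= T^{m-1}\left(z^{n-m+1}\int_z^{\infty} \eta^{-3}\left(\frac{t}{\eta} + 1\right)^{n-m-1} d\eta\right)
\end{aligned}
\tag{4.6}
\]

(note that $x + y = t$). Making the change of variables $v = \frac{t}{\eta} + 1$ and integrating, we obtain,
after some computation,

\[
\int_z^{\infty} \eta^{-3}\left(\frac{t}{\eta} + 1\right)^{n-m-1} d\eta
= \frac{1}{t^2(n - m + 1)(n - m)}\left[(n - m)\left(1 + \frac{t}{z}\right)^{n-m+1} - (n - m + 1)\left(1 + \frac{t}{z}\right)^{n-m} + 1\right]. \tag{4.7}
\]

From (4.5) and (4.6)-(4.7),

\[
P_1(s,t,m,n) = \frac{2\,n!}{(n - m + 1)!}\,T^{m-1}\left(t^{-2}\left[(n - m)(z + t)^{n-m+1} - (n - m + 1)z(z + t)^{n-m} + z^{n-m+1}\right]\right). \tag{4.8}
\]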

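Here

\[
t^{-2}\left[(n - m)(z + t)^{n-m+1} - (n - m + 1)z(z + t)^{n-m} + z^{n-m+1}\right] = \sum_{r=2}^{n-m+1} t^{\,r-2}\,T(m,n,r,z), \tag{4.9}
\]

where

\[
T(m,n,r,z) := (n - m)\binom{n - m + 1}{r} z^{n-m+1-r} - (n - m + 1)\binom{n - m}{r} z^{n-m+1-r}. \tag{4.10}
\]

Then, since $t = x + y$,

\[
\sum_{r=2}^{n-m+1} t^{\,r-2}\,T(m,n,r,z) = \sum_{r=2}^{n-m+1} T(m,n,r,z)\sum_{j=0}^{r-2}\binom{r-2}{j} x^{j} y^{\,r-2-j}. \tag{4.11}
\]

From (4.8)-(4.11),

\[
\begin{aligned}
P_1(s,t,m,n) &= \frac{2\,n!}{(n - m + 1)!}\,T^{m-1}\left(\sum_{r=2}^{n-m+1} T(m,n,r,z)\sum_{j=0}^{r-2}\binom{r-2}{j} x^{j} y^{\,r-2-j}\right) \\
&= \frac{2\,n!}{(n - m + 1)!}\sum_{r=2}^{n-m+1} T(m,n,r,z)\sum_{j=0}^{r-2}\binom{r-2}{j} y^{\,r-2-j}\,T^{m-1}(x^{j}) \\
&= \frac{2\,n!}{(n - m + 1)!}\sum_{r=2}^{n-m+1} T(m,n,r,z)\sum_{j=0}^{r-2}\binom{r-2}{j} y^{\,r-2-j}\,\frac{x^{\,j+m-1}}{(j+1)\cdots(j+m-1)}.
\end{aligned}
\tag{4.12}
\]

Because of the partial fraction expansion

\[
\frac{1}{(j+1)\cdots(j+m-1)} = \frac{1}{(m-2)!}\sum_{l=0}^{m-2} \frac{(-1)^l\binom{m-2}{l}}{j + l + 1},
\]

it follows that

\[
\begin{aligned}
\sum_{j=0}^{r-2}\binom{r-2}{j} y^{\,r-2-j}\,\frac{x^{\,j+m-1}}{(j+1)\cdots(j+m-1)}
&= \sum_{j=0}^{r-2}\binom{r-2}{j} y^{\,r-2-j} x^{\,j+m-1}\,\frac{1}{(m-2)!}\sum_{l=0}^{m-2} \frac{(-1)^l\binom{m-2}{l}}{j + l + 1} \\
&= \frac{1}{(m-2)!}\sum_{l=0}^{m-2}(-1)^l\binom{m-2}{l} x^{\,m-2-l}\sum_{j=0}^{r-2}\binom{r-2}{j} y^{\,r-2-j}\int_0^x \xi^{\,j+l}\,d\xi \\
&= \frac{1}{(m-2)!}\sum_{l=0}^{m-2}(-1)^l\binom{m-2}{l} x^{\,m-2-l}\int_0^x \xi^{\,l}(\xi + y)^{r-2}\,d\xi \\
&= \frac{1}{(m-2)!}\int_0^x (x - \xi)^{m-2}(\xi + y)^{r-2}\,d\xi.
\end{aligned}
\tag{4.13}
\]

From (4.12)-(4.13),

\[
\begin{aligned}
P_1(s,t,m,n) &= \frac{2\,n!}{(n - m + 1)!\,(m-2)!}\sum_{r=2}^{n-m+1} T(m,n,r,z)\int_0^x (x - \xi)^{m-2}(\xi + y)^{r-2}\,d\xi \\
&= 2n\binom{n-1}{m-2}\int_0^x \sum_{r=2}^{n-m+1} T(m,n,r,z)(x - \xi)^{m-2}(\xi + y)^{r-2}\,d\xi \\
&= 2n\binom{n-1}{m-2}\int_0^x \frac{(x - \xi)^{m-2}}{(\xi + y)^2}\sum_{r=2}^{n-m+1} T(m,n,r,z)(\xi + y)^{r}\,d\xi.
\end{aligned}
\tag{4.14}
\]

Here, by (4.10),

\[
\begin{aligned}
\sum_{r=2}^{n-m+1} T(m,n,r,z)(\xi + y)^{r}
&= (n - m)\sum_{r=2}^{n-m+1}\binom{n - m + 1}{r}(\xi + y)^{r} z^{n-m+1-r} - (n - m + 1)\,z\sum_{r=2}^{n-m}\binom{n - m}{r}(\xi + y)^{r} z^{n-m-r} \\
&= (n - m)\left[(\xi + y + z)^{n-m+1} - z^{n-m+1} - (n - m + 1)(\xi + y)z^{n-m}\right] \\
&\quad - (n - m + 1)\,z\left[(\xi + y + z)^{n-m} - z^{n-m} - (n - m)(\xi + y)z^{n-m-1}\right] \\
&= (n - m)(\xi + y + z)^{n-m+1} - (n - m + 1)\,z\,(\xi + y + z)^{n-m} + z^{n-m+1}.
\end{aligned}
\tag{4.15}
\]

Substitution of (4.15) into (4.14) gives the desired (4.4). □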
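Proof of Lemma 4.1. From Lemma 4.2, we have

\[
\frac1n\sum_{m=2}^{n} P_1(s,t,m,n) = 2\int_0^x \frac{1}{(\xi + y)^2}\sum_{m=2}^{n}\left[T_1(m,n,\xi,x,y,z) - T_2(m,n,\xi,x,y,z) + T_3(m,n,\xi,x,y,z)\right]d\xi. \tag{4.16}
\]

Here

\[
\begin{aligned}
\sum_{m=2}^{n} T_1(m,n,\xi,x,y,z)
&= (\xi + y + z)^2\,\frac{\partial}{\partial w}\left[w^{-1}\sum_{m=2}^{n}\binom{n-1}{m-2}(x - \xi)^{m-2} w^{n-m+1}\right]_{w = \xi + y + z} \\
&= (\xi + y + z)^2\left\{-w^{-2}\left[(x - \xi + w)^{n-1} - (x - \xi)^{n-1}\right] + w^{-1}(n - 1)(x - \xi + w)^{n-2}\right\}\Big|_{w = \xi + y + z} \\
&= (x - \xi)^{n-1} - 1 + (n - 1)(\xi + y + z)
\end{aligned}
\tag{4.17}
\]

(note that $x + y + z = 1$). Similarly,

\[
\sum_{m=2}^{n} T_2(m,n,\xi,x,y,z)
= z\,\frac{\partial}{\partial w}\left[\sum_{m=2}^{n}\binom{n-1}{m-2}(x - \xi)^{m-2} w^{n-m+1}\right]_{w = \xi + y + z}
= z\left[(n - 1)(x - \xi + w)^{n-2}\right]_{w = \xi + y + z} = z(n - 1), \tag{4.18}
\]

and

\[
\sum_{m=2}^{n} T_3(m,n,\xi,x,y,z) = \sum_{m=2}^{n}\binom{n-1}{m-2}(x - \xi)^{m-2} z^{n-m+1} = (x - \xi + z)^{n-1} - (x - \xi)^{n-1}.
\]

Hence

\[
\sum_{m=2}^{n}\left[T_1(m,n,\xi,x,y,z) - T_2(m,n,\xi,x,y,z) + T_3(m,n,\xi,x,y,z)\right] = (n - 1)(\xi + y) - 1 + (x - \xi + z)^{n-1}. \tag{4.19}
\]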
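
The summation identity (4.19) can be confirmed exactly for small n. The sketch below assumes the definitions $T_1 = \binom{n-1}{m-2}(x-\xi)^{m-2}(n-m)(\xi+y+z)^{n-m+1}$, $T_2 = \binom{n-1}{m-2}(x-\xi)^{m-2}(n-m+1)\,z\,(\xi+y+z)^{n-m}$, and $T_3 = \binom{n-1}{m-2}(x-\xi)^{m-2}z^{n-m+1}$, together with the constraint $x + y + z = 1$; the function names and the sample point are chosen here for illustration.

```python
from fractions import Fraction
from math import comb

def summed_terms(n, xi, x, y, z):
    """Sum over m = 2..n of T1 - T2 + T3 at the point (xi, x, y, z)."""
    total = Fraction(0)
    w = xi + y + z
    for m in range(2, n + 1):
        common = comb(n - 1, m - 2) * (x - xi) ** (m - 2)
        t1 = common * (n - m) * w ** (n - m + 1)
        t2 = common * (n - m + 1) * z * w ** (n - m)
        t3 = common * z ** (n - m + 1)
        total += t1 - t2 + t3
    return total

def closed_form(n, xi, x, y, z):
    """Right side of the identity: (n-1)(xi+y) - 1 + (x - xi + z)^(n-1)."""
    return (n - 1) * (xi + y) - 1 + (x - xi + z) ** (n - 1)

# Sample point with x + y + z = 1 and 0 < xi < x (values picked arbitrarily).
x, y = Fraction(1, 4), Fraction(1, 3)
z = 1 - x - y
xi = Fraction(1, 10)
```

The identity relies on $x + y + z = 1$; at a point violating that constraint the two sides generally differ.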
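Therefore, from (4.16) and (4.19), we obtain

\[
\begin{aligned}
\frac1n\sum_{m=2}^{n} P_1(s,t,m,n) &= 2\int_0^x \frac{1}{(\xi + y)^2}\left[(n - 1)(\xi + y) - 1 + (x - \xi + z)^{n-1}\right]d\xi \\
&= 2\int_0^x \frac{1}{(\xi + y)^2}\left\{(n - 1)(\xi + y) - 1 + \left[1 - (\xi + y)\right]^{n-1}\right\}d\xi.
\end{aligned}
\tag{4.20}
\]

We complete the proof by using (4.20) to compute $\int_0^1\!\int_s^1 \beta(s,t)\,\frac1n\sum_{m=2}^{n} P_1(s,t,m,n)\,dt\,ds$.
We have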

-1 rl 



n 1 v— -\ 
/3(s,f) — > Pi{s,t,m,n) ds dt 



m=2 

2/;A(m)eW("- 7'"-'!-""' .>.^. 



Js 



'1 fi 



= 2 p{s,t) Y.i-'^y 

Jo Js 

Closely following the derivations shown in ()3.3p - ()3.8p . one can show that 

Thus, in order to complete the proof, it remains to show that 

' 0(M) '-^'^'T' ., (4.23) 






Indeed, we have 

i=2 



fc=0 " ^ z V ■ ' 

fc=0 i=2 



j J Jo J - 1 yp-C'+i)-);] VO 



Here 



(2-'=-.)A2-('=M) ^ . ^ if0<t;<2-('^+i) 
[2-(.+i)-.] vo - I 2-^ - ^; if 2-('=+i) < ^; < 2"^ 



Thus 



2-%i-l ^(2-*-t,)A2-('=+l) ;L 

(is dv 



J - 1 7[2-(fc+i)-i;]V0 

2-fc(j+i)(-x _ 2-J) 
(j + - 1) 

From (ji:2i|) and ([06]) . we obtain 

™l „l n-l 



2-(fc+l) 



2-(fc+i) 



J=2 

oo n— 1 



1\ 2-'=0+i)(l - 2" 



OO At — 1 / -J \ 



j J (j + - 1) (1 - 2-^)2 

(-lyr-') 



i=2 

n-l 1\.7V"— 1 

§(j + i)j(i-i)(i-2-^r 



and (4.23) is proved. □






4.1.2 Exact Computation of $\mu_2(\bar{m}, n)$ and $\mu(\bar{m}, n)$

The derivations for obtaining a computationally preferable exact expression for $\mu_2(\bar{m}, n)$ are
entirely analogous to those for $\mu_1(\bar{m}, n)$ described in the previous section (Section 4.1.1).
Thus we omit details. As described in Section 3.1, $P_2(s,t,m,n)$ is zero for m = 1 and for
m = n, so, from (4.1),

\[
\mu_2(\bar{m},n) = \int_0^1\!\!\int_s^1 \beta(s,t)\,\frac1n\sum_{m=2}^{n-1} P_2(s,t,m,n)\,dt\,ds. \tag{4.28}
\]

Therefore we first derive a computationally desirable expression for $\frac1n\sum_{m=2}^{n-1} P_2(s,t,m,n)$.
Again, let $x := s$, $y := t - s$, $z := 1 - t$. Then



^ 71—1 

— 7 P2(s,t,m,n) 



m=2 



n-l 

n 



n i — i + 1 \i — 1,1, j — i — lA,n — j 

n— 1 , n— 1 ^ n~l 



E 51(771, ri, X, y, z) - i S2{m, n, x,y,z) - - S^im, n, x, y, z), (4.29) 



n — ' n — ' n 

m=2 171=2 m=2 



where 



Si{m,n,x,y, z) := ( ) 

S2{m,n,x,y,z) := V -. — r ( . ^ ^ . ^ ^ ^ .) 

^-^ 1 — I + 1 \i — 1,1, 1 — I — l,l,n — T / 



m<i<j<n ' 

n 



Ss{m,n,x,y,z) := ^ — "T • ii- • 11 .) x^-^-'-'z''-^ . 

Fill and Janson [3] showed that $S_1(m,n,x,y,z) = 2\sum_{j=2}^{n}(-1)^j\binom{n}{j}(t - s)^{j-2}$. Hence

\[
\frac1n\sum_{m=2}^{n-1} S_1(m,n,x,y,z) = \frac{2(n - 2)}{n}\sum_{j=2}^{n}(-1)^j\binom{n}{j}(t - s)^{j-2}. \tag{4.30}
\]

Following the derivations shown in (4.5) through (4.20), one can show that

n-l 
m=2 

2(t - s)''s{[l - (t - s)]"-^ - 1 + (t - s)(n - 1)} 



^ n-l 

- y 52(m,n,a;,y,z) = 2y-2a;[(2; + ^)"-i _ 1 + y(„ _ 1)] (4.31) 



m=2 

m-l 



/ _ 1 \ 
j=2 \ J J 
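To obtain a similar expression for $\frac1n\sum_{m=2}^{n-1} S_3(m,n,x,y,z)$, we note that, letting $m' := n + 1 - m$,
$i' := n + 1 - j$, $j' := n + 1 - i$,

\[
S_3(m,n,x,y,z) = \sum_{m' \le i' < j' \le n} \frac{2}{j' - i' + 1}\binom{n}{i'-1,\,1,\,j'-i'-1,\,1,\,n-j'} z^{\,i'-1} y^{\,j'-i'-1} x^{\,n-j'}
= S_2(n + 1 - m, n, z, y, x). \tag{4.32}
\]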

Thus 

n— 1 , 1 



- 53(m,n,a;,y,z) = - S'2(n + 1 - m, n, y, x) 

m=2 
^ n— 1 

- 'S'2("i, n, z, y, x). (4.33) 



n ^ — ' n 

m=2 m=2 

n-1 

n 



m=2 

Inspecting (lOB - fOSl) . we find 

n— 1 n— 1 



1 / 1\ 

- ^ 53 (m, n, X, y, = 2(1 - t) ^^(-l)-'- T T (t - s)^-^. (4.34) 



m=2 j=2 

From (ICTD . and (ICTI) . 

n— 1 rvN n / \ ra— 1 



" m=l j=2 y j=2 \ J / 

-2(l-t)|;(-l)^("-')(t-.)^-2 



i=2 



i=2 ' 3=2 

n—l / 

• / n — 1 



i=2 

Hence, from (jOSj) and (fi35]) . 

2(n-2) 



1 1 Pis,t)p^i-iy(^yt-sy-'dtds 

-2 Pis, t) Y^i-iy ^ ^) - ^)'"' 
Fill and Janson [3] showed that
A careful term-by-term inspection of the derivations shown in (4.24)-(4.27) reveals that

J— ■'^ J— ^ 

Combining (4.36)-(4.39), we obtain



. E + + - 2-««)l - - i)[i - ^ " 

= V- o 1)1 +2(^-1). (4.40) 

Finally, we complete the exact computation of $\mu(\bar{m}, n)$. From (4.2), (4.3), and (4.40),
we have

$\mu(\bar{m}, n) = 2\mu_1(\bar{m}, n) + \mu_2(\bar{m}, n)$

9^t^ J-1 -l)(i-2)(l-2-i) A.(j + i)j(j-_i)(i_2-.) 

_1 y \^ ' + 2(n - 1). (4.41) 

We rewrite or combine some of the terms in (4.41) for the asymptotic analysis of $\mu(\bar{m}, n)$
described in the next section. We define



Foin 



F^in 



Fa in 



Fdn 



{3-m-2r 



E 

j=3 

n-1 

g(-iy(V) 

j=2 
n-1 



n 



(•) 



i-1 



j-1 



E 
E 

i=3 



i(j-l)(l-2-J) 



--i-(ri) 

J -2 



(-iy(-) 



j(j-l)(j-2)[l-2'0-i)]' 



The second, third, fourth, and fifth terms in (j4.41|) can be written as —^Fi{n), ^F2{n), 
|i<3(n), and — 4^4(71), respectively. The last three terms in (I4.4ip can be combined as follows: 



n-1 



-4E 



+ 2(n - 1) 



4 " (-iy(-) 4 " 

" (i - 1)(J - 2)[1 - 2-0--1)] n j{j - 1)[1 - 2-a-i)] 



+ 2(n- 1) 



-E 



(-l)^(-) 



'^,^j(i-l)(i-2)[l-2-0-i)] n 



Therefore 



H{rh,n) = 2(n - 1) - fFi(n) + ^FaN + |i^3(n) - 4F4(n) + ^F^in). 



(4.42) 



4.2 Asymptotic Analysis of $\mu(\bar{m}, n)$

We derive an asymptotic expression for $\mu(\bar{m}, n)$ shown in (4.42). The computations described
in this section are analogous to those in Section 3.2. Hence we merely sketch details to derive
the asymptotic expression. First, we analyze $F_1(n)$. A routine complex-analytical argument
similar to (but much easier than) the one described in Section 3.2 shows that

2 

Fi(n) = (-l)-+i^Res, 



k=0 



s(s-l)2(s-2)2(s-3)---(s-n) 



(-1) 



n+l 



_-[)n ( — 1)'"' ( 5 

-J- + (-l)"n/7„_i + ^^n{n - 1) - - 



--n{n - l)Hn-2 + -^n{n - 1) - nHn-i - - 



1 



-n lnn + 



5 7 
4 ~ 2 



n — n In n + 



n 



2{n - 1) 



(7 + l)n + 0(l). (4.43) 
Since $F_2(n)$ is equal to $t_n$, which is defined at (3.9) and analyzed in Section 3.2, we
already have an asymptotic expression for $F_2(n)$. Next we derive an asymptotic expression
for $F_3(n)$:



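$$
\begin{aligned}
F_3(n) &= (-1)^{n} \sum_{k=0}^{1} \operatorname{Res}_{s=k} \frac{(n-1)!}{s(s-1)^2(s-2)\cdots[s-(n-1)]} \\
&= n H_{n-2} - n - H_{n-2} + 2 \\
&= n \ln n + (\gamma - 1)\, n - \ln n + O(1). \qquad (4.44)
\end{aligned}
$$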
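As a numerical illustration of (4.44), the following sketch compares the exact value of $F_3(n)$ with its leading asymptotic terms. It assumes that $F_3(n) = \sum_{j=2}^{n-1} (-1)^j \binom{n-1}{j} / (j-1)$, the sum matching the residue form above; the helper names and the hard-coded value of Euler's constant are ours.

```python
import math
from fractions import Fraction

EULER_GAMMA = 0.5772156649015329  # Euler's constant, to double precision


def harmonic(n):
    """Exact harmonic number H_n, with H_0 = 0."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


def f3_sum(n):
    """Assumed definition: sum_{j=2}^{n-1} (-1)^j C(n-1,j) / (j-1)."""
    return sum(
        Fraction((-1) ** j * math.comb(n - 1, j), j - 1)
        for j in range(2, n)
    )


def f3_closed(n):
    """Exact closed form n*H_{n-2} - n - H_{n-2} + 2."""
    h = harmonic(n - 2)
    return n * h - n - h + 2


def f3_leading(n):
    """Leading terms n ln n + (gamma - 1) n - ln n of the expansion."""
    return n * math.log(n) + (EULER_GAMMA - 1) * n - math.log(n)


if __name__ == "__main__":
    for n in (10, 100, 500):
        gap = float(f3_closed(n)) - f3_leading(n)
        print(n, gap)  # the gap should stay bounded as n grows
```

The printed gap is the $O(1)$ remainder; it is expected to settle near a constant rather than grow with $n$.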



To obtain an asymptotic expression for $F_4(n)$, we closely follow the approach of
Section 3.2. Let $u_n := F_4(n+1) - F_4(n)$. Then



5, 



j=3 



j(i-i)(i-2)(i-2-^) 



/ n 













Let $v_n := u_{n+1} - u_n$. Then, by computations similar to those performed for $w_n$ in Section 3.2,

n-2 



C(-2-fe) 



k=0 



(fc + 2)(A; + l)[l-2-(fc+3)] V ^ 



n — 1 



-l)-+i J]Res,=_J- 
fe=i 



C(-2-s) 



(n-1)! 



+(-!)"+! Res.=_3+;,, 

fcez\{o} 
1 1 1 



s + 2)(s + 1)[1 - 2-(«+3)] s(s - 1) • • • [s - (n - 1)] 
C(-2-s) (n-1)! 



(s + 2){s + 1)[1 - 2-(^+3)] s{s-l)---[s-{n- 1)] 

7 1 Hn+2 



9n n(n + 1) n(n + l)(n + 2) 



ln2 2 n + 2 



where 



fcez\{o} 



C(i-xfc)r(i-xfc)r(n) 

(ln2)r(n + 3-Xfc) 



Hence 



n-1 

-Hn-l + « + In - „, „ , , . 

9 imiVn n+1 



1 fHn Hn+i\ ^1 3 + ln2-27 1 

n 4 In 2 n(n + 1) ' 



where 



7 41 7 ^ ^(i_;^^)r(i-xfc) 



E 



36 In 2 72 12 In 2 ^^^^^^ (In 2)(2 - Xfc)r(4 - Xfe) ^ 
1^ ^ C(l-Xfc)r(l-Xfe)r(n) 
Thus



n-l 



i=2 

1 8 / 

^ 1 iJ„ _^ 3 + In 2 - 27 1 



n 



9 8 In 2 



3 + In 2 - 27 
81n2 



21n2 n 



41n2 n' 



2S + 6 - C„ 

(4.45) 



where 



E 

fcGZ\{0} 

E 

fcGZ\{0} 



C(i-xfe)r(i-xfc) 



(In2)(2-Xfc)(l-Xfc)r(3-Xfc)' 

C(i-xfc)r(i-xfc)r(n) 

(In2)(2-Xfc)(l-Xfc)r(n+1-Xfc)' 



Therefore 



-F4('^) = ^nlnn + ( a + ^7 - M n + ^ Inn + 0(1). 
9 V 9 9 / 9 



(4.46) 



Finally, we analyze $F_5(n)$. By computations that are entirely analogous to those
performed for $F_1(n)$, $F_2(n)$, and $F_4(n)$,



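$$
\begin{aligned}
F_5(n) &= (-1)^{n+1} \sum_{k=0}^{2} \operatorname{Res}_{s=k} \frac{n!}{[1 - 2^{-(s-1)}]\, s^2 (s-1)^2 (s-2)^2 (s-3) \cdots (s-n)} \\
&\quad + (-1)^{n+1} \sum_{k \in \mathbb{Z} \setminus \{0\}} \operatorname{Res}_{s=1+\chi_k} \frac{n!}{[1 - 2^{-(s-1)}]\, s^2 (s-1)^2 (s-2)^2 (s-3) \cdots (s-n)} \\
&= \frac{1}{4} (2 H_n + 3 + 4 \ln 2) - \frac{1}{2}\, n(n-1) (H_{n-2} - \ln 2 - 3) \\
&\quad - n \left[ \frac{1}{2 \ln 2} (H_{n-1})^2 + \left( \frac{1}{2} - \frac{1}{\ln 2} \right) H_{n-1} + \frac{1}{2 \ln 2} H^{(2)}_{n-1} + \frac{2}{\ln 2} + \frac{\ln 2}{12} - \frac{1}{2} \right] \\
&\quad + \sum_{k \in \mathbb{Z} \setminus \{0\}} \frac{\Gamma(-1 - \chi_k)\, \Gamma(n+1)}{(\ln 2)\, \chi_k (\chi_k^2 - 1)\, \Gamma(n - \chi_k)} \\
&= -\frac{1}{2}\, n^2 \ln n + \frac{3 + \ln 2 - \gamma}{2}\, n^2 - \frac{1}{2 \ln 2}\, n (\ln n)^2 + \frac{1 - \gamma}{\ln 2}\, n \ln n + O(n). \qquad (4.47)
\end{aligned}
$$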



Therefore, from (4.42)–(4.44) and (4.46)–(4.47), we obtain the following asymptotic
formula for $\mu(\bar{m}, n)$:



^i{m,n) = 4(1 +ln2 - d)n - ^(Inn)^ + 4 ( - 1 ) lnn + 0(1). 



Vln2 



(4.48) 



The asymptotic slope $4(1 + \ln 2 - \tilde{a})$ is approximately 8.20731.
5 Derivation of a Closed Formula for $\mu(m, n)$

The exact expression for $\mu(m, n)$ obtained in Section 2 [see (2.8)] involves infinite summation
and integration. Hence it is not a convenient form for numerically computing the expectation.
In this section, we establish another exact expression for $\mu(m, n)$ that involves only finite
summation. We also use the formula to compute $\mu(m, n)$ for $m = 1, \dots, n$ and $n = 2, \dots, 20$.
As described in Section 2, it follows from equations (2.6)–(2.8) that

$$\mu(m, n) = \mu_1(m, n) + \mu_2(m, n) + \mu_3(m, n), \qquad (5.1)$$

where, for $q = 1, 2, 3$,

$$\mu_q(m, n) := \sum_{k=0}^{\infty} \sum_{l=1}^{2^k} \int_{s=(l-1)2^{-k}}^{(l-\frac{1}{2})2^{-k}} \int_{t=(l-\frac{1}{2})2^{-k}}^{l\,2^{-k}} (k+1)\, P_q(s, t, m, n)\, dt\, ds. \qquad (5.2)$$



The same technique can be applied to eliminate the infinite summation and integration from
each $\mu_q(m, n)$. We describe the technique for obtaining a closed expression of $\mu_1(m, n)$ in
detail.
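To make the region of integration in (5.2) concrete, the sketch below computes, for two keys $s < t$ in $[0, 1)$, the number of bits examined when they are compared and the indices $(k, l)$ of the dyadic cell that contains the pair. It assumes the usual cost model in which comparing two keys costs the position of the first bit at which their binary expansions differ, so that the weight is $k + 1$ when the first $k$ bits agree; the function names are ours.

```python
from fractions import Fraction


def bits_compared(s, t, max_bits=64):
    """Position of the first bit at which the binary expansions
    of s and t (both in [0, 1)) differ."""
    for k in range(max_bits):
        s, t = 2 * s, 2 * t
        bit_s, bit_t = int(s), int(t)   # next bit of each expansion
        s, t = s - bit_s, t - bit_t
        if bit_s != bit_t:
            return k + 1
    raise ValueError("keys agree on the first max_bits bits")


def dyadic_cell(s, t):
    """Return (k, l) with s in [(l-1)/2^k, (l-1/2)/2^k) and
    t in [(l-1/2)/2^k, l/2^k), assuming s < t."""
    k = bits_compared(s, t) - 1         # number of leading bits shared
    l = int(s * 2 ** k) + 1             # 1-based index of the shared prefix
    return k, l


if __name__ == "__main__":
    s, t = Fraction(5, 8), Fraction(7, 8)   # 0.101 and 0.111 in binary
    print(bits_compared(s, t), dyadic_cell(s, t))
```

Using `Fraction` inputs keeps the bit extraction exact; with floating-point keys the same code works for as many bits as the representation carries.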

First, we transform $P_1(s, t, m, n)$ shown in (2.3) so that we can eliminate the integration
in $\mu_1(m, n)$. Define

$$C_1(i, j) := I\{1 \le m \le i < j \le n\}\, \frac{2}{j - m + 1} \binom{n}{i-1,\ 1,\ j-i-1,\ 1,\ n-j}, \qquad (5.3)$$

where $I\{1 \le m \le i < j \le n\}$ is an indicator function that equals 1 if the event in braces
holds and 0 otherwise. Since



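$$(t - s)^{j-i-1} (1 - t)^{n-j} = \sum_{u=0}^{j-i-1} \sum_{v=0}^{n-j} (-1)^{u+v} \binom{j-i-1}{u} \binom{n-j}{v} s^{u}\, t^{\,j-i-1-u+v},$$

it follows that

$$
\begin{aligned}
P_1(s, t, m, n) &= \sum_{m \le i < j \le n} C_1(i, j) \sum_{u=0}^{j-i-1} \sum_{v=0}^{n-j} s^{\,i+u-1}\, t^{\,j-i-u+v-1} (-1)^{u+v} \binom{j-i-1}{u} \binom{n-j}{v} \\
&= \sum_{m \le i < j \le n} C_1(i, j) \sum_{f=i-1}^{j-2} \sum_{h=j-f-2}^{n-f-2} s^{f} t^{h} (-1)^{h-i-j+1} \binom{j-i-1}{f-i+1} \binom{n-j}{h-j+f+2} \\
&= \sum_{f=m-1}^{n-2} \sum_{h=0}^{n-f-2} s^{f} t^{h}\, C_2(f, h), \qquad (5.4)
\end{aligned}
$$

where

$$C_2(f, h) := \sum_{i=m}^{f+1} \sum_{j=f+2}^{f+h+2} C_1(i, j)\, (-1)^{h-i-j+1} \binom{j-i-1}{f-i+1} \binom{n-j}{h-j+f+2}.$$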
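Thus, from (5.2) and (5.4), we can eliminate the integration in $\mu_1(m, n)$ and express it using
polynomials in $l$:

$$\mu_1(m, n) = \sum_{f=m-1}^{n-2} \sum_{h=0}^{n-f-2} C_3(f, h) \sum_{k=0}^{\infty} (k+1) \sum_{l=1}^{2^k} 2^{-k(f+h+2)} \left[ l^{h+1} - \left( l - \tfrac{1}{2} \right)^{h+1} \right] \left[ \left( l - \tfrac{1}{2} \right)^{f+1} - (l - 1)^{f+1} \right], \qquad (5.5)$$

where

$$C_3(f, h) := \frac{1}{(h+1)(f+1)}\, C_2(f, h).$$

Note that

$$l^{h+1} - \left( l - \tfrac{1}{2} \right)^{h+1} = \sum_{i'=0}^{h} \binom{h+1}{i'} (-1)^{h-i'} \left( \tfrac{1}{2} \right)^{h+1-i'} l^{i'}, \qquad \left( l - \tfrac{1}{2} \right)^{f+1} - (l - 1)^{f+1} = \sum_{j'=0}^{f} \binom{f+1}{j'} (-1)^{f-j'} \left[ 1 - 2^{-(f+1-j')} \right] l^{j'}.$$

Hence

$$\left[ l^{h+1} - \left( l - \tfrac{1}{2} \right)^{h+1} \right] \left[ \left( l - \tfrac{1}{2} \right)^{f+1} - (l - 1)^{f+1} \right] = \sum_{j'=0}^{f} \sum_{i'=0}^{h} \binom{f+1}{j'} \binom{h+1}{i'} (-1)^{f+h-i'-j'} \left( \tfrac{1}{2} \right)^{h+1-i'} \left[ 1 - 2^{-(f+1-j')} \right] l^{\,i'+j'},$$

which can be rearranged to

$$\left[ l^{h+1} - \left( l - \tfrac{1}{2} \right)^{h+1} \right] \left[ \left( l - \tfrac{1}{2} \right)^{f+1} - (l - 1)^{f+1} \right] = \sum_{j=1}^{f+h+1} C_4(f, h, j)\, l^{\,j-1}, \qquad (5.6)$$

where

$$C_4(f, h, j) := (-1)^{f+h-j+1} \sum_{j' = 0 \vee (j-1-h)}^{(j-1) \wedge f} \binom{f+1}{j'} \binom{h+1}{j-1-j'} \left( \tfrac{1}{2} \right)^{h-j+2+j'} \left[ 1 - 2^{-(f+1-j')} \right].$$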

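Therefore, from (5.5)–(5.6), we obtain

$$
\begin{aligned}
\mu_1(m, n) &= \sum_{f=m-1}^{n-2} \sum_{h=0}^{n-f-2} C_3(f, h) \sum_{k=0}^{\infty} (k+1) \sum_{l=1}^{2^k} 2^{-k(f+h+2)} \sum_{j=1}^{f+h+1} C_4(f, h, j)\, l^{\,j-1} \\
&= \sum_{f=m-1}^{n-2} \sum_{h=0}^{n-f-2} \sum_{j=1}^{f+h+1} C_5(f, h, j) \sum_{k=0}^{\infty} (k+1)\, 2^{-k(f+h+2)} \sum_{l=1}^{2^k} l^{\,j-1},
\end{aligned}
$$

where

$$C_5(f, h, j) := C_3(f, h) \cdot C_4(f, h, j).$$

Here, as described in Section 3.1,

$$\sum_{l=1}^{2^k} l^{\,j-1} = \sum_{r=0}^{j-1} a_{j,r}\, 2^{k(j-r)},$$

where $a_{j,r}$ is defined by (3.4). Now define

$$C_6(f, h, j, r) := a_{j,r}\, C_5(f, h, j).$$

Then

$$
\begin{aligned}
\mu_1(m, n) &= \sum_{f=m-1}^{n-2} \sum_{h=0}^{n-f-2} \sum_{j=1}^{f+h+1} \sum_{r=0}^{j-1} C_6(f, h, j, r) \sum_{k=0}^{\infty} (k+1)\, 2^{-k(f+h+2+r-j)} \\
&= \sum_{f=m-1}^{n-2} \sum_{h=0}^{n-f-2} \sum_{j=1}^{f+h+1} \sum_{r=0}^{j-1} C_6(f, h, j, r) \left[ 1 - 2^{-(f+h+2+r-j)} \right]^{-2} \\
&= \sum_{a=1}^{n-1} C_7(a)\, (1 - 2^{-a})^{-2}, \qquad (5.7)
\end{aligned}
$$

where

$$C_7(a) := \sum_{f=m-1}^{n-2} \sum_{h=\alpha}^{n-f-2} \sum_{j=\beta}^{f+h+1} C_6(f, h, j, a + j - (f + h + 2)),$$

in which $\alpha := 0 \vee (a - f - 1)$ and $\beta := 1 \vee (f + h + 2 - a)$.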
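The two identities that remove the sums over $l$ and $k$ can be checked in isolation. The sketch below is an illustration under stated assumptions: it uses the standard Faulhaber coefficients $\binom{j}{r} B_r / j$ (with the convention $B_1 = +\tfrac{1}{2}$) as a stand-in for the coefficients of (3.4), which lie outside this section, together with the series $\sum_{k \ge 0} (k+1) x^k = (1-x)^{-2}$. The symbol `c` abbreviates $f + h + 2$, and all function names are ours.

```python
from fractions import Fraction
from math import comb


def bernoulli_plus(n):
    """Bernoulli numbers B_0..B_n with the convention B_1 = +1/2."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    if n >= 1:
        B[1] = Fraction(1, 2)   # switch from B_1 = -1/2 to B_1 = +1/2
    return B


def faulhaber_coeff(j, r, B):
    """Assumed form of the coefficient of N^(j-r) in sum_{l=1}^{N} l^(j-1)."""
    return Fraction(comb(j, r), j) * B[r]


def power_sum_closed(j, k):
    """sum_{l=1}^{2^k} l^(j-1) as a polynomial in 2^k."""
    B = bernoulli_plus(j - 1)
    return sum(faulhaber_coeff(j, r, B) * 2 ** (k * (j - r)) for r in range(j))


def inner_series_closed(c, j):
    """sum_{k>=0} (k+1) 2^(-k c) sum_{l=1}^{2^k} l^(j-1), for c > j."""
    B = bernoulli_plus(j - 1)
    return sum(
        faulhaber_coeff(j, r, B) / (1 - Fraction(1, 2 ** (c + r - j))) ** 2
        for r in range(j)
    )


def inner_series_truncated(c, j, terms=200):
    """Same series, truncated after `terms` values of k."""
    return sum(
        (k + 1) * Fraction(1, 2 ** (k * c)) * power_sum_closed(j, k)
        for k in range(terms)
    )


if __name__ == "__main__":
    c, j = 5, 3   # e.g. f + h + 2 = 5 and j = 3
    print(float(inner_series_closed(c, j)), float(inner_series_truncated(c, j)))
```

Since $j \le f + h + 1$, the exponent $c + r - j$ is at least 1, so every geometric ratio is at most $\tfrac{1}{2}$ and the truncated series converges quickly to the closed value.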

The procedure described above can be applied to derive analogous exact formulae
for $\mu_2(m, n)$ and $\mu_3(m, n)$. In order to derive the analogous exact formula for $\mu_2(m, n)$,
one need only start the derivation by changing the indicator function in $C_1(i, j)$ [see (5.3)]
to $I\{1 \le i < m < j \le n\}$ and follow each step of the procedure; for $\mu_3(m, n)$, start the
derivation by changing the indicator function to $I\{1 \le i < j \le m \le n\}$.

Using the closed exact formulae of $\mu_1(m, n)$, $\mu_2(m, n)$, and $\mu_3(m, n)$, we computed
$\mu(m, n)$ for $n = 2, 3, \dots, 20$ and $m = 1, 2, \dots, n$. Figure 1 shows the results, which suggest
the following: (i) for fixed $n$, $\mu(m, n)$ increases in $m$ for $m \le \frac{n+1}{2}$ and is symmetric about
$\frac{n+1}{2}$; (ii) for fixed $m$, $\mu(m, n)$ increases in $n$ (asymptotically linearly).
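These observations can also be explored by simulation, independently of the closed formulae. The sketch below is a Monte Carlo estimate, not the exact computation described above. It assumes that keys are truncated to 64 random bits, that the pivot is chosen uniformly at random, and that comparing two keys costs the position of their first differing bit; all names are ours.

```python
import random

KEY_BITS = 64  # assumed truncation length for the simulated keys


def bit_cost(x, y):
    """Bits examined to compare two distinct KEY_BITS-bit keys:
    the position (from the most significant bit) of the first difference."""
    return KEY_BITS - (x ^ y).bit_length() + 1


def quickselect_bit_cost(keys, m, rng):
    """Bit comparisons used by Quickselect to find the m-th smallest key."""
    cost = 0
    while True:
        pivot = keys[rng.randrange(len(keys))]
        smaller, larger = [], []
        for x in keys:
            if x == pivot:
                continue
            cost += bit_cost(x, pivot)
            (smaller if x < pivot else larger).append(x)
        if m == len(smaller) + 1:
            return cost
        if m <= len(smaller):
            keys = smaller
        else:
            m -= len(smaller) + 1
            keys = larger


def estimate_mu(n, m, trials=20000, seed=0):
    """Monte Carlo estimate of the expected number of bit comparisons."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        keys = set()
        while len(keys) < n:            # draw n distinct random keys
            keys.add(rng.getrandbits(KEY_BITS))
        total += quickselect_bit_cost(list(keys), m, rng)
    return total / trials


if __name__ == "__main__":
    n = 9
    for m in range(1, n + 1):
        print(m, round(estimate_mu(n, m, trials=5000), 2))
```

For $n = 2$ the single comparison examines a geometrically distributed number of bits with mean 2, which gives a quick sanity check on the cost model; for larger $n$ the estimates should display the symmetry in $m$ noted above, up to sampling noise.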



6 Discussion 



Our investigation of the bit complexity of Quickselect revealed that the expected number
of bit comparisons required by Quickselect to find the smallest or largest key from a set
of n keys is asymptotically linear in n with the asymptotic slope approximately equal to
5.27938. Hence asymptotically it differs from the expected number of key comparisons to
achieve the same task only by a constant factor. (The expectation for key comparisons is
asymptotically 2n; see Knuth [10] and Mahmoud et al. [13].) This result contrasts with
the Quicksort case, in which the expected number of bit comparisons is asymptotically
n(ln n)(lg n) whereas the expected number of key comparisons is asymptotically 2n ln n (see
Fill and Janson [3]). Our analysis also showed that the expected number of bit comparisons
for the average case remains asymptotically linear in n with the lead-order coefficient
approximately equal to 8.20731. Again, the expected number is asymptotically different from that
of key comparisons for the average case only by a constant factor. (The expected number of
key comparisons for the average case is asymptotically 3n; see Mahmoud et al. [14].)

[Plot omitted; vertical axis: expectation of bit comparisons.]

Figure 1: Expected number of bit comparisons for Quickselect. The closed formulae for
$\mu_1(m, n)$, $\mu_2(m, n)$, and $\mu_3(m, n)$ were used to compute $\mu(m, n)$ for n = 1, 2, ..., 20 (n
represents the number of keys) and m = 1, 2, ..., n (m represents the rank of the target key).

Although we have yet to establish a formula analogous to (3.8) and (4.42) for the
expected number of bit comparisons to find the m-th key for fixed m, we established an
exact expression that only requires finite summation and used it to obtain the results shown
in Figure 1. However, the formula remains complex. Written as a single expression, $\mu(m, n)$
is a seven-fold sum of rather elementary terms with each sum having order n terms (in the
worst case); in this sense, the running time of the algorithm for computing $\mu(m, n)$ is of
order $n^7$. The expression for $\mu(m, n)$ does not allow us to derive an asymptotic formula for
it or to prove the two intuitively obvious observations described at the end of Section 5.
The situation is substantially better for the expected number of key comparisons to find the
m-th key from a set of n keys; Knuth [10] showed that the expectation can be written as
$2[n + 3 + (n + 1)H_n - (m + 2)H_m - (n + 3 - m)H_{n+1-m}]$.
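Knuth's closed form is easy to cross-check against the standard recurrence for Quickselect with a uniformly random pivot. The sketch below is such a check; the recurrence (condition on the rank of the pivot, pay $n - 1$ key comparisons for the partition, and recurse into the side containing the target) is the textbook one and is stated here as an assumption rather than taken from this paper. Function names are ours.

```python
from fractions import Fraction
from functools import lru_cache


def harmonic(n):
    """Exact harmonic number H_n, with H_0 = 0."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


def knuth_key_comparisons(m, n):
    """Closed form 2[n + 3 + (n+1)H_n - (m+2)H_m - (n+3-m)H_{n+1-m}]."""
    return 2 * (
        n + 3
        + (n + 1) * harmonic(n)
        - (m + 2) * harmonic(m)
        - (n + 3 - m) * harmonic(n + 1 - m)
    )


@lru_cache(maxsize=None)
def recurrence_key_comparisons(m, n):
    """Expected key comparisons to find the m-th of n keys, by recursion
    on the rank p of a uniformly random pivot."""
    if n <= 1:
        return Fraction(0)
    total = Fraction(0)
    for p in range(1, n + 1):
        if p < m:        # target lies among the n - p larger keys
            total += recurrence_key_comparisons(m - p, n - p)
        elif p > m:      # target lies among the p - 1 smaller keys
            total += recurrence_key_comparisons(m, p - 1)
    return (n - 1) + total / n


if __name__ == "__main__":
    for m in range(1, 6):
        print(m, knuth_key_comparisons(m, 5), recurrence_key_comparisons(m, 5))
```

The same recursion structure underlies the bit-comparison analysis; what changes there is the cost of each comparison, which is why no comparably short closed form is available.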

In this paper, we considered independent and uniformly distributed keys in (0,1). 
In this case, each bit in bit strings is 1 with probability 0.5. In our future research, we 
intend to generalize the bit strings and consider each bit resulting from an independent 
Bernoulli trial with parameter p. This generalization will further elucidate the bit
complexity of Quickselect and other algorithms.

Acknowledgment. We thank Philippe Flajolet, Svante Janson, and Helmut Prodinger 
for helpful discussions. 

7 Appendix 

In order to prove (3.16), it suffices to show that, for any positive integer m,



(note that n > 2 and < ^ < 1). Letting t := — 1 — s, it is thus sufficient to show that 





= 0. 



Using the residue theorem, we obtain 



n 



■2+ioo 



dt 



J = -27:1 ^(-1) 



k\{n + l-k)\ (n + 2)! 



+ 



2—ioo 



C{t)m' 



t(t+l)---[t+(n + l)] 



.fc=o 



(7.1) 
The "2" in the second term here could just as well be any real number exceeding 1. Here
' kl{n + l-k)\ 2 {n + 1)1 {k + ly.in + I - k)\ ^kl{n + 2-ky.' 

k — k — ]_ k — ]_ 

Therefore 



/c!(n + l-A;)! (n + 2)! 



m 



-(n+1) 



■^^^ ij,(n+l)! 



(n + 1)! ^fc!(n + 2-fc)! n + 2 

^_(„+l) m-l ^ m-l . 

" (n + 1)! ^ " (^TTl)! ^ ~ m J ' ^^'^^ 

for the second equality, see Knuth [11] (Exercise 1.2.11.2-4). On the other hand, Flajolet et
al. [1] showed that

2+'- dt 2ni f 



2-,,oo ^^'^ t(t+i)---[t+(n+i)] = (^myi ^ " ■ ^ 

Thus it follows from (7.1)–(7.3) that J = 0.